HiTRACE: High-throughput robust analysis for capillary electrophoresis

Authors: Bylund; Cormen; Cover; Das; Deigan; Ewing; Hanjoo Kim; Jinkyu Kim; Justine Hum; Kay; Kazmi; Kladwang; Laederach; Levenberg; Marquardt; Merino; Mitra; Nielsen; Oppenheim; Peattie; Pravdova; Rhiju Das; Robinson; Ruiz-Martinez; Seunghyun Park; Sungroh Yoon; Tijerina; Tomasi; Vasa; Walczak; Watts; Weeks; Wilkinson; Wipapat Kladwang; Wong; Woolley; Xi
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2011

Motivation: Capillary electrophoresis (CE) of nucleic acids is a workhorse technology underlying high-throughput genome analysis and large-scale chemical mapping for nucleic acid structural inference. Despite the wide availability of CE-based instruments, there remain challenges in leveraging their full power for quantitative analysis of RNA and DNA structure, thermodynamics, and kinetics. In particular, the slow rate and poor automation of available analysis tools have bottlenecked a new generation of studies involving hundreds of CE profiles per experiment. Results: We propose a computational method called high-throughput robust analysis for capillary electrophoresis (HiTRACE) to automate the key tasks in large-scale nucleic acid CE analysis, including the profile alignment that has heretofore been a rate-limiting step in the highest throughput experiments. We illustrate the application of HiTRACE on thirteen data sets representing four different RNAs, three chemical modification strategies, and up to 480 single mutant variants; the largest data sets each include 87,360 bands. By applying a series of robust dynamic programming algorithms, HiTRACE outperforms prior tools in terms of alignment and fitting quality, as assessed by measures including the correlation of quantified band intensities between replicate data sets. Furthermore, while the smallest of these data sets required 7 to 10 hours of manual intervention using prior approaches, HiTRACE quantitation of even the largest data sets herein was achieved in 3 to 12 minutes. The HiTRACE method therefore resolves a critical barrier to the efficient and accurate analysis of nucleic acid structure in experiments involving tens of thousands of electrophoretic bands. Comment: Revised to include Supplement. Availability: HiTRACE is freely available for download at http://hitrace.stanford.ed

arXiv.org e-Print Archive

CiteSeerX

Crossref

Understanding the errors of SHAPE-directed RNA structure modeling

Authors: Aviran S.; Brunel C.; Butler E. B.; Byrne R. T.; Cate J. H.; Christopher C. VanLang; Collins K.; Correll C. C.; Cruz J. A.; Darty K.; Das R.; Deigan K. E.; Efron B.; Felsenstein J.; Gesteland R. F.; Hickey D. R.; Hofacker I. L.; Kladwang W.; Kulshina N.; Kwon M.; Lemay J. F.; Leontis N. B.; Levitt M.; Mandal M.; Mathews D. H.; Mills D. R.; Mitra S.; Mortimer S. A.; Noller H. F.; Pablo Cordero; Panning B.; Pedersen J. S.; Regulski E. E.; Rhiju Das; Russell R.; Serganov A.; Smith K. D.; Staley J. P.; Sudarsan N.; Sussman J. L.; Takamoto K.; Vasa S. M.; Watts J. M.; Wilkinson K. A.; Winkler W. C.; Wipapat Kladwang; Yoon S.
Publication venue: American Chemical Society (ACS)
Publication date: 07/09/2011

Single-nucleotide-resolution chemical mapping for structured RNA is being rapidly advanced by new chemistries, faster readouts, and coupling to computational algorithms. Recent tests have shown that selective 2'-hydroxyl acylation by primer extension (SHAPE) can give near-zero error rates (0-2%) in modeling the helices of RNA secondary structure. Here, we benchmark the method using six molecules for which crystallographic data are available: tRNA(phe) and 5S rRNA from Escherichia coli, the P4-P6 domain of the Tetrahymena group I ribozyme, and ligand-bound domains from riboswitches for adenine, cyclic di-GMP, and glycine. SHAPE-directed modeling of these highly structured RNAs gave an overall false negative rate (FNR) of 17% and a false discovery rate (FDR) of 21%, with at least one helix prediction error in five of the six cases. Extensive variations of data processing, normalization, and modeling parameters did not significantly mitigate modeling errors. Only one variation, filtering out data collected with deoxyinosine triphosphate during primer extension, gave a modest improvement (FNR = 12% and FDR = 14%). The residual structure modeling errors are explained by the insufficient information content of these RNAs' SHAPE data, as evaluated by a nonparametric bootstrapping analysis. Beyond these benchmark cases, bootstrapping suggests a low level of confidence (<50%) in the majority of helices in a previously proposed SHAPE-directed model for the HIV-1 RNA genome. Thus, SHAPE-directed RNA modeling is not always unambiguous, and helix-by-helix confidence estimates, as described herein, may be critical for interpreting results from this powerful methodology. Comment: Biochemistry, Article ASAP (Aug. 15, 2011

arXiv.org e-Print Archive

Crossref

Deep learning models for predicting RNA degradation via dual crowdsourcing

Authors: Amer Karim; Chiu King Yuen; Das Rhiju; Demkin Maggie; Fares Mohamed; Fujikawa Kazuki; Gao Jiayang; He Shujun; Ishi Keiichiro; Ito Takuya; Kim Do Soon; Kladwang Wipapat; Lee Youhan; Mao Hanfei; Nicol John J.; Noumi Taiga; Onodera Kazuki; Reade Walter; Romano Jonathan; Steenwinckel Bram; Tinti Michele; Tunguz Bojan; Vandewiele Gilles; Watkins Andrew M.; Wayment-Steele Hannah K.; Wellington-Oguri Roger; Öztürk Emin; Öztürk Fatih
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

Medicines based on messenger RNA (mRNA) hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition (‘Stanford OpenVaccine’) on Kaggle, involving single-nucleotide resolution measurements on 6,043 diverse 102–130-nucleotide RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504–1,588 nucleotides) with improved accuracy compared with previously published models. These results indicate that such models can represent in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for dataset creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales.

PubMed Central

University of Dundee Online Publications

Deep learning models for predicting RNA degradation via dual crowdsourcing

Messenger RNA-based medicines hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition ("Stanford OpenVaccine") on Kaggle, involving single-nucleotide resolution measurements on 6043 diverse 102-130-nucleotide RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504-1588 nucleotides) with improved accuracy compared to previously published models. Top teams integrated natural language processing architectures and data augmentation techniques with predictions from previous dynamic programming models for RNA secondary structure. These results indicate that such models are capable of representing in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for data set creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales.

arXiv.org e-Print Archive

PubMed Central

University of Dundee Online Publications

Cryo-EM and antisense targeting of the 28-kDa frameshift stimulation element from the SARS-CoV-2 RNA genome

Authors: Baric Ralph S.; Bernardin-Souibgui Claire; Chiu Wah; D'; Das Rhiju; Glenn Jeffrey S.; Hagey Rachel J.; Haslecker Raphael; Hou Yixuan J.; Kladwang Wipapat; Kretsch Rachael; Li Shanshan; Pham Edward A.; Pintilie Grigore D.; Rangan Ramya; Sheahan Timothy P.; Wu Marie Teng-Pei; Zhang Kaiming; Zheludev Ivan N.
Publication date: 01/01/2021

Drug discovery campaigns against COVID-19 are beginning to target the SARS-CoV-2 RNA genome. The highly conserved frameshift stimulation element (FSE), required for balanced expression of viral proteins, is a particularly attractive SARS-CoV-2 RNA target. Here we present a 6.9 Å resolution cryo-EM structure of the FSE (88 nucleotides, ~28 kDa), validated through an RNA nanostructure tagging method. The tertiary structure presents a topologically complex fold in which the 5′ end is threaded through a ring formed inside a three-stem pseudoknot. Guided by this structure, we develop antisense oligonucleotides that impair FSE function in frameshifting assays and knock down SARS-CoV-2 virus replication in A549-ACE2 cells at 100 nM concentration.

PubMed Central

Carolina Digital Repository